fix(code-review): split cached token usage by alex-alecu · Pull Request #4086 · Kilo-Org/cloud

alex-alecu · 2026-06-18T07:42:48Z

Summary

Code reviews showed one large input number, which made cached work look like new work. This change separates input, output, and cached tokens in review comments and the review details page, while keeping old reviews readable when cache data is not available.

Verification

Open a completed review that has cached token data.
Check that its summary comment and details page show Input, Output, and Cached values.
Open an older completed review without cache data and check that Cached shows a dash.

Visual Changes

N/A

Reviewer Notes

Billing data is loaded only for finished reviews, so active review polling does not add billing queries. Stored input totals and retry limits keep their existing meaning, and no database change is needed.

kilo-code-bot · 2026-06-18T07:52:04Z

+  if (review?.model) {
+    return {
+      model: review.model,
+      tokensIn: review.total_tokens_in ?? null,


WARNING: "Input" token count has inconsistent semantics between the two return paths of getReviewUsageData.

The billing path (line 609) returns tokensIn: billing.totalUncachedTokens (cache-exclusive), so the footer/details render Input 200 · Cached 800. This fallback returns tokensIn: review.total_tokens_in, which is cache-inclusive: for v2 reviews it is back-filled from billing.totalTokensIn (the raw input_tokens sum, line 600), and for v1 reviews it is the SSE-accumulated prompt_tokens (cache-inclusive per OpenRouter convention, see processUsage.ts:755).

Consequently two reviews with identical actual usage render very differently depending on whether billing rows are still available: a new review shows Input 200 · Cached 800 while an older one shows Input 1000 · Cached —. Since the stated goal of this PR is to stop cached work from looking like new work, the fallback silently re-introduces that confusion for old reviews. Consider normalizing the fallback (e.g. only show total_tokens_in as a combined input when no cached breakdown exists, or relabel it Input (incl. cached)), so the "Input" label means the same thing across reviews.

Reply with @kilocode-bot fix it to have Kilo Code address this issue.

Sounds like something you want to fix?

kilo-code-bot · 2026-06-18T07:52:19Z

Code Review Summary

Status: 1 Issue Found | Recommendation: Address before merge

Executive Summary

The incremental commit correctly bounds the review-usage lookup window and fixes model consistency, but the previously-flagged inconsistent "Input" token-count semantics between the billing and fallback paths remains unresolved.

Overview

Severity	Count
CRITICAL	0
WARNING	1
SUGGESTION	0

Issue Details (click to expand)

WARNING

File	Line	Issue
`apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts`	623	"Input" token count is cache-inclusive in the `review.model` fallback (`review.total_tokens_in`) but cache-exclusive in the billing path (`billing.totalUncachedTokens`), so the same label means different things across reviews

Incremental Review Notes

New commit 71b08f9ef ("fix(code-review): bound review usage lookup") reviewed against the previous review at fed3ff7d.

Verified correct:

getSessionUsageFromBilling now accepts reviewCompletedAt and applies lte(created_at, reviewCompletedAt), preventing later reviews that reuse the same CLI session from inflating a completed review's totals. The undefined guard preserves prior unbounded behavior for callers that don't pass a completion time.
Model handling changed to review.model ?? billing.model, and updateCodeReviewUsage only writes model when review.model == null, keeping stored and returned model consistent.
Router and callback route both pass review.completed_at ?? undefined.
New DB test (code-reviews.test.ts) covers the time-window exclusion with a realistic two-row fixture and asserts the expected cached/uncached token split.
No memory leaks: changes are pure additions to existing async functions; no new module-scope state introduced.

Not addressed (carried forward): the WARNING above on line 623. This commit only changed the model field and added the upper bound; the fallback path still returns review.total_tokens_in (cache-inclusive) while the billing path returns billing.totalUncachedTokens (cache-exclusive) under the same tokensIn label.

Files Reviewed (5 files in incremental diff)

apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts - 0 new issues (1 carried forward)
apps/web/src/app/api/internal/code-review-status/[reviewId]/route.test.ts - 0 issues
apps/web/src/lib/code-reviews/db/code-reviews.ts - 0 issues
apps/web/src/lib/code-reviews/db/code-reviews.test.ts - 0 issues
apps/web/src/routers/code-reviews/code-reviews-router.ts - 0 issues

Fix these issues in Kilo Cloud

Previous Review Summary (commit fed3ff7)

Current summary above is authoritative. Previous snapshots are kept for context only.

Previous review (commit `fed3ff7`)

Status: 1 Issue Found | Recommendation: Address before merge

Overview

Severity	Count
CRITICAL	0
WARNING	1
SUGGESTION	0

Issue Details (click to expand)

WARNING

File	Line	Issue
`apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts`	619	"Input" token count is cache-inclusive in the `review.model` fallback but cache-exclusive in the billing path, so the same label means different things across reviews

Files Reviewed (8 files)

apps/web/src/app/(app)/code-reviews/[reviewId]/CodeReviewDetailClient.tsx - 0 issues
apps/web/src/app/api/internal/code-review-status/[reviewId]/route.test.ts - 0 issues
apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts - 1 issue
apps/web/src/lib/code-reviews/db/code-reviews.ts - 0 issues
apps/web/src/lib/code-reviews/summary/usage-footer.test.ts - 0 issues
apps/web/src/lib/code-reviews/summary/usage-footer.ts - 0 issues
apps/web/src/routers/code-reviews-router.test.ts - 0 issues
apps/web/src/routers/code-reviews/code-reviews-router.ts - 0 issues

Notes

Verified microdollar_usage.input_tokens is cache-inclusive in this codebase (processUsage.ts:946 computes uncachedInputTokens = inputTokens - cacheHitTokens - cacheWriteTokens), so totalUncachedTokens = totalTokensIn - totalCachedTokens in getSessionUsageFromBilling is correct and consistent with existing convention.
The retry guard (failedSessionTokenCount / canRetryFailedSessionUsage) still uses totalTokensIn + totalTokensOut, so cache-inclusive totals and retry limits keep their existing meaning, as the PR description states. No behavior change there.
Billing is only loaded for terminal statuses (completed, failed, cancelled, interrupted) in the get router, so active-review polling does not add billing queries. Confirmed.
No memory leaks introduced: tokenCountFormatter is a module-scope singleton Intl.NumberFormat, and the router awaits getSessionUsageFromBilling via Promise.all (no floating promise).
usage-footer.ts has no server-only imports, so importing formatTokenCount into the client component is safe.

Fix these issues in Kilo Cloud

_{Reviewed by glm-5.2-20260616 · 865,208 tokens}

_{Review guidance: REVIEW.md from base branch main}

RSO · 2026-06-18T11:41:11Z

+  if (review?.model) {
+    return {
+      model: review.model,
+      tokensIn: review.total_tokens_in ?? null,


Sounds like something you want to fix?

RSO · 2026-06-18T11:46:09Z

-    expect(before.success).toBe(true);
-    expect(before.success ? before.attempts : []).toEqual([]);
+    if (!before.success) {
+      throw new Error('Expected successful code review get');


Why are you throwing in the test? Shouldn't this just be an assert?

RSO · 2026-06-18T11:49:54Z

+      const tokenUsage = billingUsage
+        ? {
+            input: billingUsage.totalUncachedTokens,
+            output: billingUsage.totalTokensOut,
+            cached: billingUsage.totalCachedTokens,
+          }
+        : {
+            input: review.total_tokens_in ?? null,
+            output: review.total_tokens_out ?? null,
+            cached: null,
+          };


What's the difference between billingUsage.totalUncachedTokens and review.total_tokens_in? Why is this distinction here?

fix(code-review): split cached token usage

fed3ff7

kilo-code-bot Bot reviewed Jun 18, 2026

View reviewed changes

fix(code-review): bound review usage lookup

71b08f9

RSO requested changes Jun 18, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(code-review): split cached token usage#4086

fix(code-review): split cached token usage#4086
alex-alecu wants to merge 2 commits into
mainfrom
fix/code-review-token-breakdown

alex-alecu commented Jun 18, 2026

Uh oh!

kilo-code-bot Bot Jun 18, 2026

Uh oh!

RSO Jun 18, 2026

Uh oh!

kilo-code-bot Bot commented Jun 18, 2026 •

edited

Loading

WARNING

Previous review (commit `fed3ff7`)

Overview

WARNING

Notes

Uh oh!

RSO Jun 18, 2026

Uh oh!

RSO Jun 18, 2026

Uh oh!

RSO Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

alex-alecu commented Jun 18, 2026

Summary

Verification

Visual Changes

Reviewer Notes

Uh oh!

kilo-code-bot Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

RSO Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Summary

Executive Summary

Overview

WARNING

Previous review (commit fed3ff7)

Overview

WARNING

Notes

Uh oh!

RSO Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

RSO Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

RSO Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

kilo-code-bot Bot commented Jun 18, 2026 •

edited

Loading

Previous review (commit `fed3ff7`)